fix: Prevent excessive Copilot premium request consumption #8721
|
The following comment was made by an LLM, it may be inaccurate: No duplicate PRs found |
2107784 to
1cd138b
Compare
|
@thdxr Your opencode work is very inspiring. Would appreciate if the team can review my PR. |
1cd138b to
16042a3
Compare
|
Also please let me know if you need the isolated docker setup I use for my testing and I will open another PR for that so it's handy for others. |
Fixes the X-Initiator header detection to correctly identify agent-initiated vs user-initiated requests. The previous logic only checked if the last message role was not "user", which failed for synthetic user messages created by message-v2.ts for tool attachments, compactions, and subtasks.
New detection strategy:
1. If ANY assistant/tool message exists -> agent (continuation)
2. If multiple user messages exist -> agent (multi-turn)
3. If user message matches synthetic patterns -> agent
This ensures only the first real user message consumes a premium request.
Fixes anomalyco#8030
Fixes anomalyco#8067
16042a3 to
5414c79
Compare
Addresses @bowmanjd feedback - removed the flawed 'multiple user messages' rule. Real user follow-ups now correctly charge premium, matching Copilot CLI behavior.
Detection logic:
1. If ANY assistant/tool message exists → agent
2. If LAST user message is synthetic → agent
3. Otherwise → user (charges premium)
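A minimal sketch of the three-rule detection described in this commit message, not the PR's actual code. The `ChatMessage` shape, the injected `isSynthetic` predicate, and the `initiatorHeader` helper are assumptions made for illustration; only the rule order and the `X-Initiator` header name come from the thread.

```typescript
type Role = "system" | "user" | "assistant" | "tool"

// Hypothetical message shape; the real one in copilot.ts may differ.
interface ChatMessage {
  role: Role
  content: string
}

// The synthetic check is passed in so the rule order stays visible.
function detectAgent(
  messages: ChatMessage[],
  isSynthetic: (msg: ChatMessage) => boolean,
): boolean {
  // Rule 1: any assistant/tool message in the history -> agent.
  if (messages.some((msg) => msg.role === "assistant" || msg.role === "tool")) return true

  // Rule 2: the last user message is synthetic -> agent.
  const lastUser = [...messages].reverse().find((msg) => msg.role === "user")
  if (lastUser !== undefined && isSynthetic(lastUser)) return true

  // Rule 3: otherwise this is a real user message (charges premium).
  return false
}

// Hypothetical helper mapping the result onto the header value.
function initiatorHeader(
  messages: ChatMessage[],
  isSynthetic: (msg: ChatMessage) => boolean,
): Record<string, string> {
  return { "X-Initiator": detectAgent(messages, isSynthetic) ? "agent" : "user" }
}
```

Injecting the predicate keeps the string patterns in one place, so the rules can be tested without depending on the exact synthetic strings.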
49f6a4e to
7bc7e18
Compare
|
Thanks for your contribution! This PR doesn't have a linked issue. All PRs must reference an existing issue. Please:
See CONTRIBUTING.md for details. |
|
Thanks for this! I've experienced this issue too |
|
This is good. The latest fix made OpenCode unbearable with a Copilot license. Fixing this quickly may also mean fewer issues get raised, since more people are moving over to OpenCode after the Copilot integration announcement. Just wondering: would the synthetic message patterns have to be maintained? I assume any change in Copilot's output messages would affect this. |
|
@FrescoFlacko Great question! The synthetic patterns are generated by OpenCode itself, not by Copilot:
So they won't break if Copilot changes - they're internal strings in this repo. Maintenance risk is low, but not zero. If someone changes these strings in
Long-term alternatives (not in this PR)
Option A: Metadata flag
Option B: Tool outputs for guidance (per @rekram1-node's suggestion)
This PR is a minimal, focused fix for the immediate issue. Happy to adjust if maintainers prefer a different direction. |
This solution has weak points:
So this is fragile and "hackable".
This is the better way. something like |
|
@aefmind You raise valid points. Let me address them:
Maintenance risk: The strings are internal to this repo (not from Copilot), so changes would be in the same codebase. But you're right there's no compile-time enforcement.
Hackability: Fair point, though users would need to know the exact patterns, and they'd only be gaming their own billing (not a security issue).
On the metadata approach: You're right that it's architecturally cleaner. I investigated and found:
Implementing metadata-based detection would require changes to
Path forward: This PR ships a minimal fix for the urgent #8030 issue. I'll open a follow-up issue for "migrate to metadata-based synthetic detection" as a cleaner long-term solution. |
|
GitHub Copilot treats every explicit user input as a new premium request, even when tools or assistant messages were used earlier in the session.
The previous detectAgent rule classified a request as agent-initiated if the conversation history contained any assistant or tool message. Because message history is cumulative, this causes all subsequent user inputs to be permanently marked as "agent" after the first tool invocation.
This behavior is incorrect:
As a result, X-Initiator is incorrectly set to "agent" for normal user follow-up requests, misrepresenting the request semantics to GitHub Copilot.
Affected code:
// Rule 1: If any assistant/tool message exists, this is a continuation
const hasNonUser = messages.some((msg: any) => ["assistant", "tool"].includes(msg.role))
if (hasNonUser) return true |
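To make the "sticky" effect concrete, here is a small sketch that isolates Rule 1 as quoted above and runs it against a two-turn conversation. The `ChatMessage` shape and the sample conversation are made up for illustration.

```typescript
// Hypothetical message shape for this illustration.
interface ChatMessage {
  role: string
  content: string
}

// Rule 1 from the affected code, isolated.
function hasNonUser(messages: ChatMessage[]): boolean {
  return messages.some((msg) => ["assistant", "tool"].includes(msg.role))
}

const conversation: ChatMessage[] = [
  { role: "user", content: "Refactor the parser" },
  { role: "assistant", content: "Done." },
]

// First turn: only the user message exists, so Rule 1 does not fire.
const firstTurn = hasNonUser(conversation.slice(0, 1))

// Real follow-up: the history now contains an assistant message,
// so Rule 1 fires even though the newest message was typed by the user.
const followUp = hasNonUser([...conversation, { role: "user", content: "Now add tests" }])

console.log({ firstTurn, followUp })
```

Because the history only grows, every request after the first assistant or tool message hits the same branch.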
|
I'm discussing w/ copilot team if subagent invocations are okay to not count against quota, it seems like the line is blurry with compliance |
|
@rekram1-node we will wait for the response. In the case of subagents in Copilot VS Code, they aren't counted against quota, but they use the same model as the main agent. The edge case here could be when subagents rely on different models with different cost or even from different providers. |
|
@rekram1-node In VS Code, sub-agents always run on the same model as the primary agent, so the multiplier is fixed and predictable. In OpenCode, sub-agents can be configured with different models. This creates a scenario where a user could run the primary agent on a low-multiplier model and delegate work to a sub-agent using a higher-multiplier model, effectively bypassing quota constraints. That model mismatch seems like the real compliance blocker. If OpenCode enforced that sub-agents cannot use a model with a higher multiplier than the primary agent (or forced a fallback to the same model), free sub-agent quota would be easier to justify — though it adds complexity. This is the only concrete blocker I see to aligning OpenCode sub-agents with Copilot’s quota semantics. |
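A rough sketch of the constraint proposed above (a sub-agent may not use a model with a higher multiplier than the primary agent, with a fallback to the primary model otherwise). Nothing like this exists in the PR; the function name, the multiplier table, and the model IDs in the usage below are all hypothetical.

```typescript
// Hypothetical lookup of premium multipliers by model ID.
type MultiplierTable = Map<string, number>

function resolveSubagentModel(
  primary: string,
  requested: string,
  multipliers: MultiplierTable,
): string {
  const primaryCost = multipliers.get(primary)
  const requestedCost = multipliers.get(requested)

  // Unknown models cannot be compared, so fall back to the primary model.
  if (primaryCost === undefined || requestedCost === undefined) return primary

  // Allow the requested model only if it is not more expensive than the primary.
  return requestedCost <= primaryCost ? requested : primary
}

// Made-up model IDs and multipliers, purely for illustration.
const exampleTable: MultiplierTable = new Map([
  ["cheap-model", 0.33],
  ["expensive-model", 3],
])
console.log(resolveSubagentModel("cheap-model", "expensive-model", exampleTable))
```

Falling back to the primary model rather than rejecting the configuration keeps the session running, at the cost of silently ignoring the user's sub-agent setting.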
|
additional context: Copilot Chat does support running sub-agents with a different model, gated behind the chat.customAgentInSubagent.enabled setting. This is discussed in microsoft/vscode#275855 |
This would never happen unless the user has explicitly configured opencode to do this or is using some external plugin. At worst the same model will be used for subagents; at best, a cheaper one |
|
hi ! is there any workaround to use opencode with copilot api in the meantime? currently, the tool is burning the premium requests quota much too fast. thanks a lot and best regards. |
Try 1.1.13, and use haiku for the cheapest test. |
|
@ananas-viber With this modification, entering user prompts multiple times will only consume quota once. However, in the Copilot extension, each time you enter a user prompt, it consumes quota. |
@caozhiyuan You're correct - this is the same issue @nsoufian identified earlier. Rule 1's "sticky" approach marks all subsequent user messages as
The fix would be to only check the last message role (like pi-mono and codecompanion do) rather than scanning the entire history. However, since @rekram1-node is implementing source-layer fixes directly, this PR may be superseded. I'll leave it open for now in case the detection approach is still wanted as a fallback. |
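A minimal sketch of the last-message check suggested above, assuming it is still combined with a synthetic-message test so that internally generated user messages are not charged. The `ChatMessage` shape, the function name, and the injected predicate are illustrative and not taken from pi-mono, codecompanion, or this PR.

```typescript
// Hypothetical message shape for this illustration.
interface ChatMessage {
  role: string
  content: string
}

function isAgentInitiated(
  messages: ChatMessage[],
  isSynthetic: (text: string) => boolean,
): boolean {
  const last = messages[messages.length - 1]

  // No messages at all: treat as user-initiated.
  if (last === undefined) return false

  // Only the newest message matters, so earlier tool calls no longer stick.
  if (last.role !== "user") return true

  // A user-role message still counts as agent-initiated if it was generated internally.
  return isSynthetic(last.content)
}
```

With this shape, a real follow-up after a tool loop is classified as user-initiated again, which matches the Copilot extension behavior described by @caozhiyuan.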
|
@ananas-viber @rekram1-node what is the status of this?
00637c0 to
71e0ba2
Compare
f1ae801 to
08fa7f7
Compare
Summary
Fixes #8030
Builds on #8393 — Adds synthetic message detection to prevent excessive premium requests.
The official plugin (#8393) uses last?.role !== "user", which fails for synthetic user messages created by OpenCode (compaction, tool attachments, subtasks). This PR detects those synthetic messages and correctly marks them as agent-initiated.
How VSCode Copilot Chat Handles This
From microsoft/vscode-copilot-chat@main:
VSCode uses explicit flags to mark tool loops, continuations, and subagents as agent. OpenCode doesn't have these flags, so we infer the same intent by checking message history.
Detection Logic
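X-Initiator values: user, agent, agent
Rules:
1. Any assistant/tool message exists → agent (matches VSCode's tool loop behavior)
2. Last user message is synthetic → agent
3. Otherwise → user
Synthetic patterns detected:
- "What did we do so far?" (compaction)
- "Tool X returned an attachment:" (tool results)
- "The following tool was executed by the user" (subtasks)
Why This Approach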
OpenCode creates synthetic user messages for internal operations that VSCode handles differently. Since we can't use VSCode's explicit flags (iterationNumber, isContinuation, subAgentInvocationId), we infer agent status from:
This aligns with third-party Copilot clients (litellm, copilot-api, crush) that use the same "Sticky" inference approach.
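For illustration, the pattern matching could look roughly like the sketch below. The function names isSynthetic() and hasSyntheticContent() are the ones listed in this PR, but the bodies here are guesses: anchoring the patterns at the start of the text, reading "X" as a tool-name placeholder, and the `MessageContent` shape are all assumptions.

```typescript
// Strings taken from the PR description; start-of-text anchoring is an assumption.
const SYNTHETIC_PATTERNS: RegExp[] = [
  /^What did we do so far\?/, // compaction
  /^Tool .+ returned an attachment:/, // tool results ("X" read as a tool name)
  /^The following tool was executed by the user/, // subtasks
]

// Hypothetical content shape: a plain string, or a list of parts where some carry text.
type ContentPart = { type: string; text?: string }
type MessageContent = string | ContentPart[]

function isSynthetic(text: string): boolean {
  const trimmed = text.trimStart()
  return SYNTHETIC_PATTERNS.some((pattern) => pattern.test(trimmed))
}

function hasSyntheticContent(content: MessageContent): boolean {
  if (typeof content === "string") return isSynthetic(content)
  // For multi-part content, any synthetic text part marks the whole message.
  return content.some((part) => typeof part.text === "string" && isSynthetic(part.text))
}
```

Anchoring at the start of the text means a user who merely quotes one of these strings mid-sentence is not misclassified, though a message that begins with one still would be.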
Changes
copilot.ts:
- isSynthetic() and hasSyntheticContent() for pattern matching
- detectAgent() with history-based inference
- detectVision() for both Completions and Responses API
copilot.test.ts (new):
Verification
Related